THE CALCULATION OF LINKAGE INTENSITIES 1

Professor R. A. EMERSON

Cornell University 

Two methods of estimating the intensity of linkage are
in use. One consists of crossing individuals heterozygous
for two or more linked genes with homozygous recessives.
This is the more direct method because the gametic ratio,
barring differential viability, is exhibited directly by
the zygotic frequencies. The other method employs ordinary
F2 ratios derived from selfing F1 individuals or from
breeding together like F1 individuals. Here the gametic
ratio can only be inferred from the numerical relation of
the zygotic classes. The results may be disturbed not
only by differential viability, as in the first method, but
also by selective fertilization, if that occurs, and may
often be materially influenced by chance in random mating
where the numbers are small. In fact, this method
is so undesirable that it should not ordinarily be used
where the other method is practicable. It is true, however,
that the mechanical difficulties of crossing certain
plants are so great and the number of seeds produced
per flower so small that often only the ordinary F2
results are available. It is important, therefore, to have a
means of calculating gametic ratios from F2 zygotic
numbers.

Since no direct formulae for calculating gametic ratios
from observed F2 data have heretofore been available,
the problem has been attacked in an indirect way. A
series of F2 zygotic ratios has first been calculated from a
corresponding series of gametic ratios. Next the observed
F2 results have been compared with the calculated
series, the closest fitting calculated ratio determined, and
the corresponding gametic ratio taken as that responsible
for the observed F2 results.

1 Paper No. 54, Department of Plant Breeding, Cornell University,
Ithaca, N. Y.

The method of determining the closeness of fit between
calculated and observed numbers used by Bateson, Punnett
and their co-workers was mere inspection. (See
Bateson and Punnett, 1911.) The unreliability of this
method was pointed out by Collins (1912), who made
use of Yule's coefficient of association for the same
purpose. The well-known formula for this coefficient is
$(ad - bc)/(ad + bc)$, where a, b, c, d are the frequencies
of the phenotypic forms AB, Ab, aB, ab, respectively.
From a table giving the coefficients of association for a
series of gametic ratios, the best fitting gametic ratio
is chosen by inspection or interpolation. This method is
satisfactory except for the higher gametic ratios, where
slight differences in the coefficients of association
correspond to wide differences in the gametic ratios. Since
the same intensity of linkage gives somewhat higher
coefficients of association for coupling than for repulsion,
particularly for the lower linkage values where the
association coefficient method is most reliable, two tables
must be used.
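
The coefficient just described can be sketched in a few lines of Python. The function name `yule_q` is an assumption made for illustration; the frequencies are the sweet pea counts quoted in the examples later in this paper.

```python
def yule_q(a, b, c, d):
    """Yule's coefficient of association, (ad - bc) / (ad + bc).

    a, b, c, d are the frequencies of the phenotypic classes
    AB, Ab, aB, ab, in that order.
    """
    ad = a * d
    bc = b * c
    return (ad - bc) / (ad + bc)


# Coupling example: purple long, purple round, red long, red round.
coupling_q = yule_q(493, 25, 25, 138)
# Repulsion example from Bateson and Punnett (1911).
repulsion_q = yule_q(336, 150, 143, 11)

print(round(coupling_q, 3), round(repulsion_q, 3))  # prints 0.982 -0.706
```

A positive coefficient indicates coupling and a negative one repulsion; translating either into a gametic ratio still requires a table or some form of interpolation.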

Formulas by which gametic ratios can be approximated
directly from F2 data, without the use of coefficients of
association and without respect to whether coupling or
repulsion is involved, would seem to merit trial. Such
formulas are presented later in this paper. Moreover, it 
is often desirable to reverse the calculation, that is, to 
determine zygotic frequencies from assumed gametic 
ratios. A single formula suggested for this purpose 
gives accurate results for both coupling and repulsion. 
This formula will be presented first because the others 
are developed from it. 

Bateson and Punnett (1911) suggested two empirical 
formulae for calculating zygotic frequencies from assumed 
gametic ratios, one for coupling and the other for repul- 
sion. Neither one, of course, is applicable to both types 
of linkage, though both formulas are true for independent 
inheritance. If A and a are allelomorphic genes and B
and b are a similar allelomorphic pair (the capital letters
denoting dominance), and if 2n equals the sum of the
gametic series, 2 then the gametic series and the phenotypic
zygotic series, AB, Ab, aB, ab, for coupling and for
repulsion are:



Gametic Series (AB : Ab : aB : ab)
  Coupling    $n - 1 : 1 : 1 : n - 1$
  Repulsion   $1 : n - 1 : n - 1 : 1$

Zygotic Series (AB : Ab : aB : ab)
  Coupling    $3n^2 - (2n - 1) : 2n - 1 : 2n - 1 : n^2 - (2n - 1)$
  Repulsion   $2n^2 + 1 : n^2 - 1 : n^2 - 1 : 1$



That is, the formulae of Bateson and Punnett are ex- 
pressed in terms of the sum of the gametic series. But 
the same thing can also be expressed in terms of the 
several members of the gametic series. Thus, if r:s is 
any gametic ratio, the usual form of gametic series is 
r:s:s:r and the frequencies of the ten possible genotypic 
classes and of the corresponding four phenotypic classes 
are: 



Genotypes                 Phenotypes

AB-AB = $r^2$
AB-Ab = $2rs$
AB-aB = $2rs$             AB = $3r^2 + 4rs + 2s^2$
AB-ab = $2r^2$
Ab-aB = $2s^2$

Ab-Ab = $s^2$             Ab = $2rs + s^2$
Ab-ab = $2rs$

aB-aB = $s^2$             aB = $2rs + s^2$
aB-ab = $2rs$

ab-ab = $r^2$             ab = $r^2$
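
The phenotypic totals in this table can be checked by enumerating the sixteen combinations of the four gametic classes. The following is a minimal sketch; the function name `phenotype_counts` and the 3:1 ratio are assumptions chosen only for illustration.

```python
from collections import Counter
from itertools import product


def phenotype_counts(r, s):
    """Combine gametes AB, Ab, aB, ab (frequencies r : s : s : r)
    in all 4 x 4 ways and total the products by F2 phenotype."""
    gametes = {"AB": r, "Ab": s, "aB": s, "ab": r}
    totals = Counter()
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        # A dominant gene carried by either gamete shows in the zygote.
        first = "A" if "A" in (g1[0], g2[0]) else "a"
        second = "B" if "B" in (g1[1], g2[1]) else "b"
        totals[first + second] += f1 * f2
    return totals


r, s = 3, 1  # hypothetical 3:1 gametic ratio, for illustration only
counts = phenotype_counts(r, s)
print(dict(counts))
```

For any positive r and s the four totals agree with the expressions in the right-hand column of the table, and their sum is $(2r + 2s)^2$.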

The general formula for calculating a phenotypic
zygotic series from a given gametic ratio is, therefore,

$$3r^2 + 2(s^2 + 2rs) : s^2 + 2rs : s^2 + 2rs : r^2 \qquad \text{(I)}$$

The sum of the zygotic series is $4r^2 + 8rs + 4s^2$ or
$(2r + 2s)^2$, which, when expressed as

$$(r + s + s + r)(r + s + s + r),$$

indicates how the formula is derived. Reference to the
diagram will make this clear.

Diagram showing, in terms of r and s, the numerical relations of the
F2 zygotic classes that result from combinations of the gametic classes
AB, Ab, aB, ab occurring in the ratio series r:s:s:r. The dominant genes
A and B are indicated by horizontal and vertical lines respectively, while
their allelomorphs a and b are indicated by the absence of such lines.
(See formula I.)

2 Bateson and Punnett considered n to be some power of 2, but this
limitation need not apply here.

Since r and s are any positive quantities, formula I is
applicable to coupling (r > s), repulsion (r < s) and
independent inheritance (r = s). It, of course, gives the
same result as the empirical formulae of Bateson and
Punnett, but is more convenient in that one formula takes
the place of the two. It is easy to use, since the fourth
term of the zygotic series is the square of r, the second
and third terms are each the square of s plus twice the
product of r and s, and the first term is the sum of the
second and third plus three times the fourth.

An approximation of gametic ratios can be obtained
from observed zygotic ratios by simple formulae derived
from formula I. If the actual values of $s^2 + 2rs$ could
be assumed to be identical in all cases, it would follow
from formula I that $4r^2 = AB + ab - (Ab + aB)$ and

$$r = \sqrt{[AB + ab - (Ab + aB)]/4}.$$

Similarly, $4(s^2 + 2rs) = AB + Ab + aB - 3r^2$ and

$$s = \sqrt{(AB + Ab + aB + r^2)/4} - r = \sqrt{(AB + Ab + aB + ab)/4} - r.$$

If E is the sum of the extreme terms and M the sum of the
middle terms of the observed zygotic series, the formulas
for approximating gametic ratios are, then, 3

$$r = .5\sqrt{E - M}, \qquad s = .5\sqrt{E + M} - r \qquad \text{(II)}$$
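
As a minimal sketch of formulae II, the two roots can be computed directly from the four observed frequencies. The function name `approximate_r_s` is an assumption; the counts are the sweet pea coupling data discussed later in this paper.

```python
import math


def approximate_r_s(AB, Ab, aB, ab):
    """Formulae II: approximate r and s from observed F2 frequencies."""
    E = AB + ab                     # sum of the extreme terms
    M = Ab + aB                     # sum of the middle terms
    r = 0.5 * math.sqrt(E - M)      # math.sqrt raises ValueError if E < M
    s = 0.5 * math.sqrt(E + M) - r
    return r, s


r, s = approximate_r_s(493, 25, 25, 138)
print(f"r = {r:.3f}, s = {s:.3f}, ratio = {r / s:.1f} : 1")
# prints: r = 12.052, s = 0.996, ratio = 12.1 : 1
```

Only the positive roots are taken, so the sketch inherits the restriction that E must not be less than M.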

If it is desired to compare the observed F2 frequencies
with a calculated series of frequencies, the procedure,
obviously, is to calculate the gametic ratio by formulae II,
or by means of the coefficient of association, and then
to calculate the zygotic series by formula I, or by one of
the two formulae of Bateson and Punnett. This procedure
is not always necessary, however, for a theoretical zygotic
series can usually be readily computed directly from the
observed frequencies. If AB, Ab, aB, ab is the series to
be calculated from the observed frequencies, it follows
from formulae I and II that

$$\begin{aligned}
Ab = aB &= M/2 \\
ab &= (E - M)/4 \qquad \text{(III)} \\
AB &= M + 3ab
\end{aligned}$$
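
Formulae III translate directly into code. The sketch below uses an assumed function name, `calculated_series`, and the sweet pea coupling counts quoted later in this paper.

```python
def calculated_series(AB, Ab, aB, ab):
    """Formulae III: a series of the form demanded by formula I,
    computed directly from the observed F2 frequencies."""
    E = AB + ab                 # sum of the extreme terms
    M = Ab + aB                 # sum of the middle terms
    middle = M / 2              # calculated Ab and aB
    fourth = (E - M) / 4        # calculated ab
    first = M + 3 * fourth      # calculated AB
    return (first, middle, middle, fourth)


series = calculated_series(493, 25, 25, 138)
print(series)       # prints (485.75, 25.0, 25.0, 145.25)
print(sum(series))  # prints 681.0, the observed total
```

The calculated series always preserves the observed total, since its terms sum to E + M.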

Since a zygotic series calculated in this way necessarily
meets the conditions imposed by formula I, the gametic
ratio can be approximated from it more readily than from
the observed frequencies. Since by formulae I and II
$ab = r^2$ and $s = .5\sqrt{E + M} - r$,

$$\begin{aligned}
r &= \sqrt{ab} \\
s &= .5\sqrt{E + M} - \sqrt{ab}
\end{aligned} \qquad \text{(IV)}$$

3 Since r and s are necessarily positive, negative roots are disregarded.

Formulae IV are not to be used in connection with observed
F2 frequencies, except when the latter approximate
closely the form demanded by formula I, that is, when the
first term of the observed frequencies equals approximately
the sum of the second and third terms plus three
times the fourth term.

In cases of repulsion, where the fourth term of the 
zygotic series is always relatively small and, therefore, 
where the first term should be only slightly greater than 
the sum of the second and third terms, it may happen that 
the sum of the first and fourth terms, E, is actually less 
than the sum of the second and third terms, M. In such 
cases, formulae II (and consequently formulae III and IV
also) can not be employed, for, if E is less than M, the
quantity under the radical (E - M) is negative and has
no real root. In such cases, the gametic ratio must be
calculated by means of the coefficient of association. 
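
The restriction just described can be made explicit with a guard on E and M. The following is a sketch only; the function name and the choice of returning `None` are assumptions, and the two sets of counts are repulsion cases discussed later in this paper.

```python
import math


def approximate_r_s_or_none(AB, Ab, aB, ab):
    """Formulae II with an explicit check for E < M.

    Returns (r, s), or None when the quantity under the radical is
    negative and the approximate method cannot be applied.
    """
    E = AB + ab
    M = Ab + aB
    if E < M:
        return None
    r = 0.5 * math.sqrt(E - M)
    s = 0.5 * math.sqrt(E + M) - r
    return r, s


usable = approximate_r_s_or_none(336, 150, 143, 11)    # E = 347, M = 293
unusable = approximate_r_s_or_none(1335, 643, 714, 2)  # E = 1337, M = 1357
print(usable, unusable)
```

In the second case the caller would fall back on the coefficient of association, as the text directs.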

The method here suggested for calculating gametic 
ratios from observed frequencies never gives quite the 
same results as that obtained by the association-coefficient 
method except when the observed series approaches 
closely the form demanded by formula I. Naturally, 
then, the more widely the observed frequencies depart 
from this form the greater the difference between the 
results given by the two methods. Since the coefficient of 
association gives reliable results if the tables to be used 
with it are based upon sufficiently small differences in the 
gametic ratios employed in its preparation, it follows that 
the methods proposed in this paper give only approxi- 
mate results. It is also true, therefore, that the nearer 
the observed frequencies approach the form of formula I, 
the closer the approximation obtained by formulae II (or 
III and IV). 

The two methods have been applied to numerous cases
taken from published accounts of linkage studies and the
goodness of fit tested by the method suggested by Harris
(1912). The differences, o - c, between the observed
frequencies, o, and the calculated frequencies, c, of the
several classes are determined and $S[(o - c)^2/c] = \chi^2$
calculated, S indicating summation.
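
The summation can be sketched in Python as follows. The function name `chi_square` is an assumption; the observed counts and the calculated series for a 2:1 gametic ratio are those quoted in the second example later in this paper.

```python
def chi_square(observed, calculated):
    """Sum of (o - c)^2 / c over the phenotypic classes."""
    return sum((o - c) ** 2 / c for o, c in zip(observed, calculated))


observed = (165, 58, 58, 78)
calculated = (219, 50, 50, 40)   # series quoted for the 2:1 ratio

value = chi_square(observed, calculated)
print(round(value, 1))  # prints 52.0
```

The probability P is not computed here; in the paper it is read from Elderton's table for the given number of classes.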

With n, the number of classes, here equaling four, and
$\chi^2$, the probability, P, that departures from the calculated
series as great as those observed might occur through
the errors of random sampling is obtained by reference
to Elderton's (1901) table (see also Pearson, 1914).
Wherever appreciably different gametic ratios have been
obtained by the two methods, P has been found to be
greater for the association-coefficient method than for the
method based on formulae II. The former method has,
therefore, given the closer fit. Since, in most of the cases
to which the test has been applied, $\chi^2$ is less than one and
since such values are not listed in Elderton's table, $\chi^2$ has
been used directly for the comparison of the two methods.
Where n is constant, the larger $\chi^2$, the less the probability.

While this test for goodness of fit has shown the
association-coefficient method to be the better of the two, the
fact that in most cases $\chi^2$ was less than one for both
methods indicates that the approximate method suggested here
ordinarily gives results such that the departures of
observed from calculated frequencies might well be due to
errors of random sampling. The method has been found
convenient and usually sufficiently accurate where only an 
approximate determination of the gametic ratio is de- 
sired. Where the observed frequencies depart widely 
from the form given by formula I, this method should not 
be used. It should be noted, however, that in such cases 
no calculated series fits the observed results well. This 
limitation to the use of the new method does not lessen 
materially the convenience of using it where it is appli- 
cable. By a mere inspection of the observed frequencies, 
it can usually be told whether they conform fairly closely 
to formula I, that is, whether the first term is approxi- 
mately equal to the sum of the second and third plus three 
times the fourth. 
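
The inspection described above amounts to comparing the first term with the sum of the second and third plus three times the fourth. A minimal sketch follows; the function name is an assumption, and no cut-off is proposed because the text gives none. The two sets of counts are the coupling examples discussed later in this paper.

```python
def formula_i_discrepancy(AB, Ab, aB, ab):
    """Relative departure of the observed first term from the value
    formula I demands, namely second + third + 3 * fourth."""
    demanded = Ab + aB + 3 * ab
    return (AB - demanded) / demanded


close_case = formula_i_discrepancy(493, 25, 25, 138)  # demanded value 464
wide_case = formula_i_discrepancy(165, 58, 58, 78)    # demanded value 350
print(f"{close_case:+.3f} {wide_case:+.3f}")
```

A value near zero suggests the approximate method may be used; a large departure, as in the second case, suggests it should not.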

A few examples will illustrate the use of the approximate
method of calculating gametic ratios from observed
data and afford a means of comparing it with the
association-coefficient method.

Harris (1912) has quoted an example of coupling in
sweet peas from the studies of Bateson, Saunders, and
Punnett 4 and calculated P where the gametic ratios are
taken as 7:1 and 15:1, the only ratios considered in the
original paper. The phenotypic classes are based on shape
of pollen and color of flowers, and the observed frequencies
are purple long 493, purple round 25, red long 25,
red round 138, total 681. As determined by Harris, on the
basis of a 7:1 gametic ratio, P = .0053 or $\chi^2$ = 12.7699.
On the 15:1 basis, P = .3086 or $\chi^2$ = 3.6375. The chances
against the 7:1 ratio are, therefore, 199 to 1 and against
the 15:1 ratio about 2 to 1. For this same material,
Collins (1912), using the association-coefficient method
(Coef. Assoc. = .982 ± .004), naturally suggested a 12:1
gametic ratio (Coef. Assoc. also = .982) and pointed out
the fact that the deviation from the 7:1 ratio is 9 times
and from the 15:1 ratio about twice the probable error.
By formulae III, the calculated series becomes 485.75
+ 25.0 + 25.0 + 145.25 = 681. By formulae IV, r = 12.052 and
s = .996, or a gametic ratio of 12.1:1. The 12:1 ratio
obtained by the association-coefficient method gives a zygotic
series of 485.5 + 25.2 + 25.2 + 145.1 = 681. Both methods,
then, give gametic ratios approximately the same and
practically identical zygotic series, namely, 485 + 25
+ 25 + 145. On the basis of this series, $\chi^2$ = .4387 and
P is so large that it is useless to determine it. In short,
both methods give gametic ratios that fit the observed
data extremely well.
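
The association-coefficient route to a gametic ratio can be imitated numerically. The sketch below replaces the table and interpolation used by Collins with a bisection search, which is a substitution made here for illustration and not the procedure of the original authors; all function names are assumptions.

```python
import math


def zygotic_series(r, s):
    """Formula I series AB : Ab : aB : ab for the gametic ratio r : s."""
    middle = s * s + 2 * r * s
    return (2 * middle + 3 * r * r, middle, middle, r * r)


def yule_q(a, b, c, d):
    """Yule's coefficient of association."""
    return (a * d - b * c) / (a * d + b * c)


def ratio_from_association(q_observed, lo=1e-3, hi=1e3, steps=100):
    """Find r (with s = 1) whose formula I series has the observed
    coefficient of association, by bisection on a logarithmic scale.
    The coefficient rises steadily with r, so bisection applies."""
    for _ in range(steps):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if yule_q(*zygotic_series(mid, 1.0)) < q_observed:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


q = yule_q(493, 25, 25, 138)
r = ratio_from_association(q)
print(f"Coef. Assoc. = {q:.3f}, gametic ratio about {r:.1f} : 1")
```

For these counts the search lands close to the 12:1 ratio suggested by Collins; values of r below 1 returned for negative coefficients correspond to repulsion.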

4 Rept. Evol. Com., 4: 11.

The next example of coupling presents a very different
condition. It has been quoted by Bridges (1914) from
Punnett's (1913) summary of reduplication series in
sweet peas. The phenotypic classes are based upon sterility
of anthers and form of flowers, and the observed
frequencies are fertile normal 165, fertile cretin 58,
sterile normal 58, sterile cretin 78, total 359. It can be
seen at a glance that these frequencies are far from
what formula I demands (58 + 58 + 3(78) = 350, over
twice 165) and that therefore the approximate method
can not be depended upon in calculating the gametic
ratio. It is interesting to note, however, just how
unreliable it is in comparison with the association-coefficient
method. By formulae III and IV, the calculated
zygotic series becomes 211 + 58 + 58 + 32 = 359, r = 5.6,
s = 3.8, and the gametic ratio is approximately 1.5:1.
Bridges referred the case to a 2:1 ratio (Coef. Assoc.
= .558), though the coefficient of association is .588, which
is equivalent to a gametic ratio of 2.1:1 (Coef. Assoc.
= .586). Punnett compared the observed frequencies
with a series derived from an assumed 3:1 ratio. The
zygotic series calculated from these ratios are, for the
2:1 ratio, 219 + 50 + 50 + 40 = 359; for the 2.1:1 ratio,
220 + 49 + 49 + 41 = 359; and for the 3:1 ratio, 230 + 39
+ 39 + 51 = 359. If now the criterion of goodness of fit
be applied to the four calculated series, the values of $\chi^2$
are, for the 1.5:1 ratio 76.1, for the 2:1 ratio 52.0, for the
2.1:1 ratio 51.4, and for the 3:1 ratio 51.3. Values of $\chi^2$
above 30 are not listed in Elderton's table, but where
$\chi^2$ = 30 and n = 4, P = .000,001, which means that there
is only one chance in a million of deviations so
great as the observed ones being due to the errors of
random sampling. Where neither of the two methods of
calculating the zygotic series gives a better fit than in
this case, it is immaterial which fit is the worse.
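
The comparison of the four candidate ratios can be sketched as below. The function names are assumptions, and the series are left unrounded, so the values printed need not agree to the last figure with those quoted above, which appear to rest on whole-number series.

```python
def scaled_series(r, s, total):
    """Formula I series for the gametic ratio r : s, scaled to a total."""
    middle = s * s + 2 * r * s
    raw = (2 * middle + 3 * r * r, middle, middle, r * r)
    scale = total / sum(raw)
    return tuple(x * scale for x in raw)


def chi_square(observed, calculated):
    """Sum of (o - c)^2 / c over the phenotypic classes."""
    return sum((o - c) ** 2 / c for o, c in zip(observed, calculated))


observed = (165, 58, 58, 78)
results = {}
for r in (1.5, 2.0, 2.1, 3.0):  # candidate ratios r : 1 named in the text
    series = scaled_series(r, 1.0, sum(observed))
    results[r] = chi_square(observed, series)
    print(f"{r}:1  chi-square = {results[r]:.1f}")
```

Every candidate gives a value far above 30, which agrees with the conclusion that no gametic ratio fits these frequencies.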

As an example of repulsion, the same characters in
sweet peas may be used. The observed frequencies
(Bateson and Punnett, 1911) are 336 + 150 + 143 + 11
= 640. Bateson and Punnett assumed that the gametic
ratio concerned was 1:3. The coefficient of association
is -.706, which is equivalent to a gametic ratio of 1:2.74.
By formulae II or III-IV, a ratio of 1:2.45 is indicated.
The values of $\chi^2$ are for the 1:2.45 ratio .649, for the
1:2.74 ratio .302, and for the 1:3 ratio .536. Here again
the association-coefficient method gives the better fit, but
the probability is great that the deviations of the observed
from the calculated frequencies, even in case of
the approximate method, might be due to errors of random
sampling.

As an illustration of the fact that the approximate
method can not be used in some cases of repulsion, even
when the observed frequencies fit fairly well the series
calculated by the association-coefficient method, an example
of linkage between dark axils and fertile anthers
in sweet peas quoted from Punnett by Bridges (1914)
may be taken. The observed frequencies are 1335 + 643
+ 714 + 2 = 2694. The value of r can not be determined
by formulae II nor by III and IV, because 1335 + 2 - (643
+ 714) is a negative quantity (-20) and has no real
root. The coefficient of association is -.988, which is
equivalent to a gametic ratio of 1:17, though Bridges
assumed a ratio of 1:20. On the basis of this 1:20 ratio,
$\chi^2$ = 5.68 and P = .1309. On the basis of the 1:17 ratio,
$\chi^2$ = 4.04 and P = .2615, or odds of about 3 to 1 against
the occurrence of deviations as great as those observed.

It may be said, then, that the formulas suggested here
afford a convenient method of approximating gametic
ratios from zygotic series when the observed frequencies
are in fair accord with a series based on formula I, or
the formulae of Bateson and Punnett. When the observed
frequencies are far from this type, no method
gives a close fit between observed and calculated results.